2023 Contribution to conference Open Access
OpenAIRE, comunità e servizi per praticare la scienza aperta [OpenAIRE: community and services for practising open science]
Pavone G., Atzori C., Baglioni M., Bardi A., Manghi P., Castelli D.
Practising research according to Open Science principles requires both technologies, with infrastructures that enable and facilitate collaboration and the massive exchange of information on an international scale, and the skills needed to make the most of them. In other words, it requires services, exchange of expertise, and training. These are the lines along which OpenAIRE (Open Access Infrastructure for Research in Europe) works: it is the European infrastructure for Open Science, offering technological services and a European network of exchange and synergy to foster open science. Launched in 2009 as a European project for monitoring Open Access, the initiative has been refinanced over the years and its scope extended to all components of Open Science. In 2018 it was established as a non-profit organisation to guarantee a permanent structure supporting national and European Open Science policies. The OpenAIRE network has more than 40 members, including research centres, universities, foundations, and service providers distributed across Europe. As a community of practice, OpenAIRE has the mission of building and operating an infrastructure that supports open and sustainable scholarly communication, providing the services, resources, and coordination of initiatives and experts needed to implement a common European environment for open science. To realise this vision, OpenAIRE offers technological, training, and support services covering the entire research life cycle (the full list of services is available at catalogue.openaire.eu). The technological services range from data management to discovery, and from journal management to monitoring research outputs and the adoption of Open Science practices.
In addition, the international network of NOADs (National Open Access Desks: openaire.eu/contact-noads) promotes open science by providing assistance and training at various levels. The goal is to enable the various actors involved in scientific activity to adopt open science and open access practices, by organising national workshops and dedicated training. NOADs also provide expert advice on the infrastructures that support open science workflows and on the definition of policies for its implementation, such as drafting and updating institutional policies, identifying regulatory obligations and funding-related requirements, and selecting tools for the Data Management Plan (DMP). CNR, and in particular its institute ISTI, as the technological development and innovation centre of the infrastructure and as operator of the Italian NOAD, works in line with the OpenAIRE mission and contributes significantly to its activities and governing bodies. CNR therefore offers its expertise to ensure the maintenance, operation, and innovation of the infrastructure, participating in initiatives and projects that contribute to the sustainability and innovation of its services. As NOAD, it offers training and support on issues such as defining DMPs, complying with the "FAIR" principles for data management, and drafting institutional policies. These activities are carried out in collaboration with the NOADs of other European countries so as to maximise the integration of solutions and policies at the European level.
Source: GenoOA Week 2023, Genoa, Italy, 23-27/10/2023

See at: ISTI Repository Open Access | CNR ExploRA


2023 Contribution to conference Open Access
OpenAIRE Graph: una risorsa aperta per la scienza aperta [OpenAIRE Graph: an open resource for open science]
Atzori C., Bardi A., Baglioni M., Manghi P.
The OpenAIRE Graph (OAG) is a knowledge graph built by aggregating information (metadata, relationships) about different entities of the research world, such as publications, datasets, software and other products, funded projects, repositories, and organisations, interconnected through semantic relationships (e.g. citations, supplements, similarity, participation in projects). The OAG is an open resource that funders, organisations, researchers, research communities, and publishers can use to gain a better understanding of the research landscape and its dynamics at various levels, both local and global. Because it is an open and freely accessible resource, produced in accordance with the core values of Open Science set out in the UNESCO Recommendation on Open Science, the OAG makes it possible to move beyond proprietary data sources, supporting the reform of the assessment of research, researchers, and organisations envisaged by the Coalition for Advancing Research Assessment (CoARA). The OAG is built from bibliographic records obtained from well-known sources such as Crossref, the open access journals registered in DOAJ (Directory of Open Access Journals), ORCID, Microsoft Academic Graph, and DataCite, as well as from more than 1000 institutional repositories. The metadata of the research products in the graph are disambiguated and enriched through full-text and data mining processes. This makes the OAG usable for a variety of purposes, including research discovery, research assessment, analysis and/or prediction of research collaborations, and support for research policy decision-making.
The OAG is a freely accessible resource: search and discovery features are available through the portal explore.openaire.eu; programmatic integration is available through the HTTP Search API; and the complete dataset, as well as other datasets offering specialised views, is available on Zenodo. The portal monitor.openaire.eu hosts several dashboards dedicated to research organisations and funders, which include the results of statistical and bibliometric analyses as well as indicators. Further information is available at https://graph.openaire.eu, which describes the data models the datasets conform to, the API documentation, and the methodological approach used to build and process the OAG. As of July 2023 the OAG includes about 170 million publications, 40 million datasets, 110K research software products, and more than 3 billion relationships among them. This makes it one of the largest collections of scholarly records in the world. It has the potential to have a significant impact on the way research is conducted and communicated. By making research data easier to find, understand, and use, the OAG can help to: accelerate scientific discovery, improve research collaboration, support research policy decisions, monitor research progress, identify areas where more investment is needed, increase the visibility of research in developing countries, support research reproducibility, and promote open science practices. Thanks to these characteristics, the OAG has the potential to contribute significantly to the progress of science and society.
Source: GenoOA Week 2023, Genova, Italy and online, 23-27/10/2023

See at: ISTI Repository Open Access | CNR ExploRA


2023 Report Unknown
InfraScience research activity report 2023
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bosio C., Bove P., Calanducci A., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., Ibrahim A. S. T., La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Molinaro E., Pagano P., Panichi G., Paratore M. T., Pavone G., Piccioli T., Sinibaldi F., Straccia U., Vannini G. L.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2023 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2023 InfraScience members contributed to the publishing of several papers, to the research and development activities of several research projects (primarily funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual Reports, 2023
DOI: 10.32079/isti-ar-2023/002
Project(s): Blue Cloud via OpenAIRE, EOSC Future via OpenAIRE, TAILOR via OpenAIRE

See at: CNR ExploRA


2022 Conference article Open Access
A preliminary assessment of the article deduplication algorithm used for the OpenAIRE Research Graph
Vichos K., De Bonis M., Kanellos I., Chatzopoulos S., Atzori C., Manola N., Manghi P., Vergoulis T.
In recent years, a large number of Scholarly Knowledge Graphs (SKGs) have been introduced in the literature. The communities behind these graphs strive to gather, clean, and integrate scholarly metadata from various sources to produce clean and easy-to-process knowledge graphs. In this context, a very important task of the respective cleaning and integration workflows is deduplication. In this paper, we briefly describe and evaluate the accuracy of the deduplication algorithm used for the OpenAIRE Research Graph. Our experiments show that the algorithm performs adequately, producing a small number of false positives and an even smaller number of false negatives.
Source: IRCDL 2022 - 18th Italian Research Conference on Digital Libraries, Padua, Italy, 24-25/02/2022
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA
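
A minimal sketch of how false positives and false negatives of a deduplication run can be counted against a ground truth. It is not the evaluation procedure of the paper: the pairwise comparison of duplicate groups, the function names, and the sample record identifiers are all assumptions made for illustration.

```python
from itertools import combinations


def pairs_from_groups(groups):
    """Expand groups of record ids into the set of unordered duplicate pairs."""
    pairs = set()
    for group in groups:
        for a, b in combinations(sorted(group), 2):
            pairs.add((a, b))
    return pairs


def evaluate(predicted_groups, true_groups):
    """Count pairwise true/false positives and false negatives (illustrative)."""
    predicted = pairs_from_groups(predicted_groups)
    truth = pairs_from_groups(true_groups)
    tp = len(predicted & truth)   # pairs correctly marked as duplicates
    fp = len(predicted - truth)   # pairs wrongly marked as duplicates
    fn = len(truth - predicted)   # duplicate pairs the algorithm missed
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


# Hypothetical record ids, for illustration only
predicted = [{"r1", "r2", "r3"}, {"r4", "r5"}]
truth = [{"r1", "r2"}, {"r4", "r5", "r6"}]
result = evaluate(predicted, truth)
```

Counting at the level of pairs keeps the measure independent of how duplicate groups are labelled; other evaluation schemes (e.g. cluster-level measures) are equally possible.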


2022 Report Open Access
InfraScience research activity report 2021
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bove P., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lazzeri E., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Ottonello E., Pagano P., Panichi G., Pavone G., Piccioli T., Sinibaldi F., Straccia U.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2021 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2021 InfraScience members contributed to the publishing of 25 papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual report, 2022
DOI: 10.32079/isti-ar-2022/001
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, PerformFISH via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA via OpenAIRE, EOSC Future via OpenAIRE, EOSCsecretariat.eu via OpenAIRE, EcoScope via OpenAIRE, RISIS 2 via OpenAIRE, OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA


2022 Journal article Open Access
FDup: a framework for general-purpose and efficient entity deduplication of record collections
De Bonis M., Manghi P., Atzori C.
Deduplication is a technique aiming at identifying and resolving duplicate metadata records in a collection. This article describes FDup (Flat Collections Deduper), a general-purpose software framework supporting a complete deduplication workflow to manage big data record collections: metadata record data model definition, identification of candidate duplicates, identification of duplicates. FDup brings two main innovations: first, it delivers a full deduplication framework in a single easy-to-use software package based on the Apache Spark Hadoop framework, where developers can customize the optimal and parallel workflow steps of blocking, sliding windows, and similarity matching function via an intuitive configuration file; second, it introduces a novel approach to improve performance, beyond the known techniques of "blocking" and "sliding window", by introducing a smart similarity matching function, T-match. T-match is engineered as a decision tree that drives the comparisons of the fields of two records as branches of predicates and allows for successful or unsuccessful early-exit strategies. The efficacy of the approach is proved by experiments performed over big data collections of metadata records in the OpenAIRE Research Graph, a known open access knowledge base in scholarly communication.
Source: PeerJ Computer Science 8 (2022). doi:10.7717/PEERJ-CS.1058
DOI: 10.7717/peerj-cs.1058
Project(s): OpenAIRE Nexus via OpenAIRE

See at: OpenAIRE Open Access | ISTI Repository Open Access | peerj.com Open Access | CNR ExploRA
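
The three steps named in the abstract (blocking, sliding window, and a decision-tree match with early exits) can be illustrated with a small in-memory sketch. This is not FDup's code or configuration format: the blocking key, the field names (`id`, `title`, `year`, `doi`), the predicates in the tree, and the window size are all hypothetical, and the real framework runs on Apache Spark rather than plain Python lists.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so titles compare consistently."""
    return " ".join(text.lower().split())


def blocking_key(record):
    # Hypothetical key: first three alphanumeric characters of the title
    return "".join(ch for ch in record["title"].lower() if ch.isalnum())[:3]


def tree_match(a, b):
    """Decision-tree style comparison with early exits (illustrative predicates)."""
    # Node 1: same non-empty DOI, exit early with a match
    if a.get("doi") and a.get("doi") == b.get("doi"):
        return True
    # Node 2: different publication year, exit early with a non-match
    if a.get("year") != b.get("year"):
        return False
    # Node 3: fall back to comparing normalised titles
    return normalise(a["title"]) == normalise(b["title"])


def candidate_pairs(records, window=3):
    """Group records by blocking key, sort each block, slide a window over it."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    for block in blocks.values():
        block.sort(key=lambda r: normalise(r["title"]))
        for i, left in enumerate(block):
            # Compare each record only with the next (window - 1) records
            for right in block[i + 1 : i + window]:
                yield left, right


def find_duplicates(records, window=3):
    return [(a["id"], b["id"]) for a, b in candidate_pairs(records, window) if tree_match(a, b)]


# Hypothetical sample records
records = [
    {"id": "a", "title": "Entity Deduplication in Graphs", "year": 2020, "doi": "10.1/x"},
    {"id": "b", "title": "entity  deduplication in graphs", "year": 2020, "doi": None},
    {"id": "c", "title": "Entity resolution survey", "year": 2019, "doi": None},
    {"id": "d", "title": "Open science monitoring", "year": 2020, "doi": None},
]
duplicates = find_duplicates(records)
```

Blocking and the sliding window limit how many pairs are compared at all; the early exits in the tree limit how much work each comparison costs, which is the performance idea the abstract attributes to T-match.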


2022 Software Unknown
dnet-dedup framework
Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., La Bruzzo S. F., Mannocci A., Manghi P.
The GDup Software enables an integrated, scalable, general-purpose system for entity deduplication over big information graphs. GDup supports practitioners with the functionalities needed to realize a fully-fledged entity deduplication workflow over a generic input graph, including Ground Truth support, end-user feedback, and strategies for identifying and merging duplicates to obtain an output disambiguated graph. GDup is today one of the core components of the OpenAIRE infrastructure production system, monitoring Open Science trends on behalf of the European Commission.
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: github.com | CNR ExploRA


2022 Report Open Access
Data model description of the OpenAIRE Research Graph
La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Mannocci A., Manghi P., Pavone G.
The OpenAIRE Graph (formerly known as the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide, key to fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. Imagine a vast collection of research products all linked together, contextualized, and openly available. Over the past years, OpenAIRE has been working to gather this valuable record. It is a massive collection of metadata and links between scientific products such as articles, datasets, software, and other research products, and entities like organizations, funders, funding streams, projects, communities, and data sources. This technical report describes the public data model adopted by the OpenAIRE Graph.
Source: ISTI Technical Report, ISTI-2022-TR/031, 2022
DOI: 10.32079/isti-tr-2022/031

See at: ISTI Repository Open Access | CNR ExploRA


2022 Report Open Access
OpenAIRE Research Graph: aggregation workflow
La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Dell'Amico A., Mannocci A., Manghi P., Pavone G.
The OpenAIRE Graph (formerly the OpenAIRE Research Graph) is one of the largest open scholarly record collections worldwide. It is key in fostering Open Science and establishing its practices in daily research activities. Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back into the hands of the scientific community. OpenAIRE collects metadata records from more than 70K scholarly communication sources worldwide, including Open Access institutional repositories, data archives, and journals. All the metadata records (i.e., descriptions of research products) are put together in a data lake with records from Crossref, Unpaywall, ORCID, ROR, and information about projects provided by national and international funders. This technical report describes the main aggregation workflow that orchestrates the data aggregation and the implemented mapping from some of the main data sources into the OpenAIRE Research Graph data model.
Source: ISTI Technical Report, ISTI-2022-TR/033, 2022
DOI: 10.32079/isti-tr-2022/033
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA


2022 Report Open Access
OpenAIRE Research Graph deduplication workflow
La Bruzzo S. F., Artini M., Atzori C., Bardi A., Baglioni M., De Bonis M., Mannocci A., Manghi P., Pavone G.
The OpenAIRE aggregation workflow can collect metadata records from different providers about the same scholarly work. Each metadata record can carry different information because, for example, some providers are not aware of links to projects, keywords, or other details. Another typical case is when OpenAIRE collects one metadata record from a repository about a pre-print and another from a journal about the published article. To provide correct statistics, OpenAIRE must identify those cases and "merge" the two metadata records so that the scholarly work is counted only once in the statistics OpenAIRE produces. This technical report describes the deduplication workflow and technique adopted to deduplicate the OpenAIRE Graph.
Source: ISTI Technical Report, ISTI-2022-TR/032, 2022
DOI: 10.32079/isti-tr-2022/032
Project(s): OpenAIRE-Connect via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA
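
The "merge" step described in the abstract can be pictured with a small sketch: two metadata records about the same work, each carrying different details, are combined so that the work is counted once. The field names (`id`, `title`, `projects`, `keywords`) and the merge policy (first title wins, list fields are unioned) are assumptions for illustration, not the rules of the OpenAIRE workflow.

```python
def merge_records(records):
    """Merge metadata records that describe the same work (illustrative policy)."""
    merged = {"ids": [], "title": None, "projects": [], "keywords": []}
    for record in records:
        merged["ids"].append(record["id"])
        # Keep the first title encountered
        if merged["title"] is None:
            merged["title"] = record.get("title")
        # Union list-valued fields, preserving order and dropping repeats
        for field in ("projects", "keywords"):
            for value in record.get(field, []):
                if value not in merged[field]:
                    merged[field].append(value)
    return merged


# Hypothetical records: a pre-print from a repository and the journal article
preprint = {"id": "repo::1", "title": "A study", "keywords": ["dedup"]}
article = {"id": "journal::9", "title": "A study", "projects": ["P-001"], "keywords": ["dedup", "graphs"]}

merged = merge_records([preprint, article])
works_counted = len([merged])  # one work instead of two records
```

The merged record keeps the identifiers of both originals, so the provenance of each piece of information is not lost when the work is counted once.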


2022 Report Open Access
InfraScience research activity report 2022
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Bove P., Candela L., Casini G., Castelli D., Cirillo R., Coro G., De Bonis M., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lelii L., Manghi P., Mangiacrapa F., Mangione D., Mannocci A., Ottonello E., Pagano P., Panichi G., Pavone G., Piccioli T., Sinibaldi F., Straccia U., Zoppi F.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2022 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2022 InfraScience members contributed to the publishing of several papers, to the research and development activities of 18 research projects (15 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual reports, 2022
DOI: 10.32079/isti-ar-2022/004
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA via OpenAIRE, EOSC Future via OpenAIRE, RISIS 2 via OpenAIRE, TAILOR via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA


2022 Other Open Access
RISIS tool demonstration event - The OpenAIRE Research Graph: an Open Access resource for research on research
Bardi A., Baglioni M., Atzori C.
RISIS embraces the International Open Access Week 2022 with a session on the OpenAIRE Research Graph: an Open Access dataset with metadata about research products (literature, datasets, software, etc.) linked to other entities of the research ecosystem like organisations, project grants, data sources, and services. The session included a presentation of the graph and a guided practical session where participants could learn how to use the OpenAIRE Research Graph for research and policy-related activities. More information about the event is available on the RISIS2 project web site. The practical part was conducted on the RISIS Lab Virtual Research Environment of the D4Science infrastructure operated by CNR - ISTI. The Jupyter notebooks can be run on the JupyterHub integrated in the RISIS Lab or in other JupyterHub instances supporting PySpark. The data analysis was performed on a subset of the OpenAIRE Research Graph composed of 848 H2020 projects related to the Sustainable Development Goal Climate Action (SDG13), their funded research products, and their related organizations (risis_dataset.zip). Details on the subset, the model, and other useful documentation are available in the slides.
Project(s): RISIS 2 via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | ISTI Repository Open Access | CNR ExploRA


2021 Conference article Open Access
Reflections on the misuses of ORCID iDs
Baglioni M., Mannocci A., Manghi P., Atzori C., Bardi A., La Bruzzo S.
Since 2012, the "Open Researcher and Contributor Identification Initiative" (ORCID) has been successfully running a worldwide registry, with the aim of unequivocally pinpointing researchers and the body of knowledge they contributed to. In practice, ORCID clients, e.g., publishers, repositories, and CRIS systems, make sure their metadata can refer to iDs in the ORCID registry to associate authors and their work unambiguously. However, the ORCID infrastructure still suffers from several "service misuses", which put its very mission at risk and should therefore be identified and tackled. In this paper, we classify and qualitatively document such misuses, originating from both users (researchers and organisations) of the ORCID registry and the ORCID clients. We conclude by providing an outlook and a few recommendations aimed at improving the exploitation of the ORCID infrastructure.
Source: IRCDL 2021 - 17th Italian Research Conference on Digital Libraries, pp. 117–125, Online conference, 18-19/02/2021
Project(s): OpenAIRE-Advance via OpenAIRE

See at: ceur-ws.org Open Access | ISTI Repository Open Access | CNR ExploRA


2021 Conference article Open Access
BIP! DB: a dataset of impact measures for scientific publications
Vergoulis T., Kanellos I., Atzori C., Mannocci A., Chatzopoulos S., La Bruzzo S., Manola N., Manghi P.
The growth rate of the number of scientific publications is constantly increasing, creating important challenges in the identification of valuable research and in various scholarly data management applications, in general. In this context, measures which can effectively quantify the scientific impact could be invaluable. In this work, we present BIP! DB, an open dataset that contains a variety of impact measures calculated for a large collection of more than 100 million scientific publications from various disciplines.
Source: WWW 2021 - Companion of the World Wide Web Conference, pp. 456–460, Online conference, 13/04/2021
DOI: 10.1145/3442442.3451369
DOI: 10.48550/arxiv.2101.12001
Project(s): OpenAIRE-Advance via OpenAIRE, OpenAIRE Nexus via OpenAIRE

See at: arXiv.org e-Print Archive Open Access | arxiv.org Open Access | ISTI Repository Open Access | dl.acm.org Restricted | doi.org Restricted | doi.org Restricted | CNR ExploRA


2021 Dataset Unknown
OpenAIRE research graph: dumps for research communities and initiatives
Manghi P., Atzori C., Bardi A., Baglioni M., Schirrwagen J., Dimitropoulos H., La Bruzzo S., Foufoulas I., Lohden A., Backer A., Mannocci A., Horst M., Czerniak A., Kiatropoulou K., Kokogiannaki A., De Bonis M., Artini M., Ottonello E., Lempesis A., Ioannidis A., Summan F.
This dataset contains dumps of the OpenAIRE Research Graph containing metadata records relevant for the research communities and initiatives collaborating with OpenAIRE. Each dataset is a tar file containing gzip files with one json per line. Each json is compliant to the schema available at DOI: 10.5281/zenodo.3974226
DOI: 10.5281/zenodo.3974604
Project(s): RISIS 2 via OpenAIRE, BE OPEN via OpenAIRE, OpenAIRE-Advance via OpenAIRE

See at: CNR ExploRA
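
A dump laid out as described in the abstract (a tar file containing gzip files with one JSON document per line) can be read with the Python standard library alone. The sketch below is a hedged illustration: the function name, the `.gz` suffix check, and the sample path in the usage note are assumptions, and the fields of each record depend on the schema referenced by the dataset.

```python
import gzip
import json
import tarfile


def iter_records(tar_path):
    """Yield one parsed JSON object per line from every .gz member of a tar file."""
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            # Skip directories and anything that is not a gzip file (assumed suffix)
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = tar.extractfile(member)
            # gzip.open accepts a file object; "rt" decodes the lines as text
            with gzip.open(raw, "rt", encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

A possible use, with a hypothetical file name, is `for record in iter_records("community_dump.tar"): ...`; records are streamed one at a time, so the whole dump never has to fit in memory.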


2021 Dataset Unknown
OpenAIRE Covid-19 publications, datasets, software and projects metadata
Bardi A., Kuchma I., Pavone G., Artini M., Atzori C., Backer A., Baglioni M., Czerniak A., De Bonis M., Dimitropoulos H., Foufoulas I., Horst M., Iatropoulou K., Jacewicz P., Kokogiannaki A., La Bruzzo S., Lazzeri E., Lohden A., Manghi P., Mannocci A., Manola N., Ottonello E., Schirrwagen J.
This dump provides access to the metadata records of publications, research data, software and projects that may be relevant to the Corona Virus Disease (COVID-19) fight. The dump contains records of the OpenAIRE COVID-19 Gateway (https://covid-19.openaire.eu/), identified via full-text mining and inference techniques applied to the OpenAIRE Research Graph (https://explore.openaire.eu/). The Graph is one of the largest Open Access collections of metadata records and links between publications, datasets, software, projects, funders, and organizations, aggregating 12,000+ scientific data sources world-wide, among which the Covid-19 data sources Zenodo COVID-19 Community, WHO (World Health Organization), BIP! Finder for COVID-19, Protein Data Bank, Dimensions, scienceOpen, and RSNA. The dump consists of a gzip file containing one json per line. Each json is compliant to the schema available at https://doi.org/10.5281/zenodo.3974226
DOI: 10.5281/zenodo.3980490
Project(s): OpenAIRE-Advance via OpenAIRE

See at: CNR ExploRA


2021 Report Open Access
InfraScience Research Activity Report 2020
Artini M., Assante M., Atzori C., Baglioni M., Bardi A., Candela L., Casini G., Castelli D., Cirillo R., Coro G., Debole F., Dell'Amico A., Frosini L., La Bruzzo S., Lazzeri E., Lelii L., Manghi P., Mangiacrapa F., Mannocci A., Pagano P., Panichi G., Piccioli T., Sinibaldi F., Straccia U.
InfraScience is a research group of the National Research Council of Italy - Institute of Information Science and Technologies (CNR - ISTI) based in Pisa, Italy. This report documents the research activity performed by this group in 2020 to highlight the major results. In particular, the InfraScience group engaged in research challenges characterising Data Infrastructures, e-Science, and Intelligent Systems. The group activity is pursued by closely connecting research and development and by promoting and supporting open science. In fact, the group is leading the development of two large scale infrastructures for Open Science, i.e. D4Science and OpenAIRE. During 2020 InfraScience members contributed to the publishing of 30 papers, to the research and development activities of 12 research projects (11 funded by the EU), to the organization of conferences and training events, and to several working groups and task forces.
Source: ISTI Annual Report, ISTI-2021-AR/002, pp.1–20, 2021
DOI: 10.32079/isti-ar-2021/002
Project(s): ARIADNEplus via OpenAIRE, Blue Cloud via OpenAIRE, PerformFISH via OpenAIRE, EOSC-Pillar via OpenAIRE, DESIRA via OpenAIRE, EOSCsecretariat.eu via OpenAIRE, RISIS 2 via OpenAIRE, TAILOR via OpenAIRE, I-GENE via OpenAIRE, MOVING via OpenAIRE, OpenAIRE-Advance via OpenAIRE, SoBigData-PlusPlus via OpenAIRE

See at: ISTI Repository Open Access | CNR ExploRA


2021 Contribution to conference Open Access
OpenOrgs: bridging registries of research organizations. Supporting disambiguation and improving the quality of data
Pavone G., Atzori C.
This presentation was given at the OpenAIRE Tech Clinic webinar on 21 June 2021, focusing on the OpenOrgs tool. Unambiguously identifying organizations involved in research work may not be a trivial task. Their names can be derived from various data sources, each of which often contains a different version of the organization's name (full legal name, short or alternative names, acronym, and so on) and different metadata fields. In OpenOrgs, data curators can enrich the metadata description of organizations and resolve the ambiguity of duplicates detected with an automated process by stating whether or not two or more entities correspond to the same organization. With these tasks, OpenOrgs users can compensate for the lack of information available and improve the organizations' discoverability.
Source: OpenAIRE Tech Clinic webinar, Online event, 21 June 2021
DOI: 10.5281/zenodo.5101096
Project(s): OpenAIRE Nexus via OpenAIRE

See at: ISTI Repository Open Access | zenodo.org Open Access | CNR ExploRA


2020 Journal article Open Access
Entity deduplication in big data graphs for scholarly communication
Manghi P., Atzori C., De Bonis M., Bardi A.
Purpose: Several online services offer functionalities to access information from "big research graphs" (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from search and browse content to the consumption of statistics for monitoring and provision of feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and current problem, existing solutions are dedicated to specific scenarios, operate on flat collections, local topology-drive challenges and cannot therefore be re-used in other contexts. Design/methodology/approach: This work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrary large information graphs. The paper presents its high-level architecture and its implementation as a service used within the OpenAIRE infrastructure system, and reports numbers from real-case experiments. Findings: GDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph. Originality/value: To our knowledge GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication of graphs, while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, National funders and institutions.
Source: Data technologies and applications 54 (2020): 409–435. doi:10.1108/DTA-09-2019-0163
DOI: 10.1108/dta-09-2019-0163
Project(s): OpenAIRE2020 via OpenAIRE, OpenAIRE-Advance via OpenAIRE

See at: Data Technologies and Applications Open Access | ISTI Repository Open Access | www.emerald.com Open Access | Data Technologies and Applications Open Access | CNR ExploRA


2019 Report Open Access
The OpenAIRE research graph: third-party publishing APIs
Atzori C., Baglioni M., Bardi A., Manghi P., La Bruzzo S., De Bonis M., Dell'Amico A., Artini M., Mannocci A., Ottonello E.
This work describes the specification of the OpenAIRE publishing APIs, which support third-party services in publishing metadata about interlinked and packaged research products into the OpenAIRE Research Graph, in compliance with the OpenAIRE interoperability guidelines (https://guidelines.openaire.eu). Research products generated by researchers using services of research infrastructures are today manually published by researchers in a repository external to their research infrastructure. This phase is often considered an extra burden, because researchers have to fill in metadata forms with information that is already available in the scope of the services they used. By using the OpenAIRE publishing APIs, services of research infrastructures can implement an on-demand publishing workflow for any type of research product, helping their researchers improve the FAIRness of their research products and relieving them of the tedious step of finding a suitable repository and manually depositing the products in it.
Source: ISTI Technical reports, 2019

See at: ISTI Repository Open Access | CNR ExploRA